Crate xml5ever
This crate provides a push-based XML parser library that adheres to the XML5 specification. In other words, this library trades well-formedness for error recovery.
The idea behind this was to minimize the number of errors from tools that generate XML (e.g. &#83 won’t just be returned as the text &#83, but will be parsed into S).
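To make the recovery idea concrete, here is a minimal sketch of lenient decoding for a decimal numeric character reference such as &#83, where the terminating semicolon is optional. The function name decode_decimal_ref is hypothetical and the code is not taken from xml5ever; it only illustrates the kind of tolerance described above, using the standard library alone.

```rust
/// Decodes a decimal numeric character reference at the start of `input`,
/// tolerating a missing trailing semicolon.
/// Returns the decoded character and the number of bytes consumed.
/// Hypothetical helper for illustration; not part of xml5ever's API.
fn decode_decimal_ref(input: &str) -> Option<(char, usize)> {
    let rest = input.strip_prefix("&#")?;
    // Count the leading ASCII digits that follow "&#".
    let digits_len = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
    if digits_len == 0 {
        return None;
    }
    // Overflowing or invalid code points yield None instead of panicking.
    let code: u32 = rest[..digits_len].parse().ok()?;
    let ch = char::from_u32(code)?;
    let mut consumed = 2 + digits_len;
    // A strict parser would require the semicolon; here it is optional.
    if rest[digits_len..].starts_with(';') {
        consumed += 1;
    }
    Some((ch, consumed))
}
```

Under these assumptions, both "&#83" and "&#83;" decode to the character S, while input that does not start with "&#" followed by digits is left for the caller to treat as plain text.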
See the XML5 specification for full details.
What this library provides is a solid XML parser that can:
- Parse somewhat erroneous XML input
- Provide support for numeric character references
- Provide partial XML namespace support
- Provide the full set of SVG/MathML entities
What isn’t in scope for this library:
- Document Type Definition parsing - this is pretty hard to do right and nowadays, it’s used
Modules
The BufferQueue struct and helper types.
Data that is known at compile-time and hard-coded into the binary.
Driver
Types for tag and attribute names, and tree-builder functionality.
Serializer for XML5.
Traits for serializing elements. The serializer expects the data to be xml-like (with a name, and optional children, attrs, text, comments, doctypes, and processing instructions). It uses the visitor pattern, where the serializer and the serializable objects are decoupled and implement their own traits.
This module contains a single struct SmallCharSet. See its documentation for details.
XML5 tokenizer - converts input into tokens
XML5 tree builder - converts tokens into a tree-like structure
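The tokenizer and tree builder descriptions above suggest a push-style pipeline: input is fed in chunks, tokens are emitted as soon as they are complete, and a sink turns them into a tree. The following is a toy sketch of that shape only. Every name here (Token, TokenSink, Tokenizer, TreeBuilder, Node) is illustrative and should not be read as the crate's actual signatures, and the tokenizer understands nothing beyond bare start tags, end tags, and text.

```rust
#[derive(Debug, PartialEq)]
enum Token {
    StartTag(String),
    EndTag(String),
    Text(String),
}

/// Receiver of tokens; the toy tree builder below implements it.
trait TokenSink {
    fn process_token(&mut self, token: Token);
}

/// Toy push tokenizer: input arrives in chunks via `feed`.
struct Tokenizer<S: TokenSink> {
    sink: S,
    pending: String, // input not yet turned into tokens
}

impl<S: TokenSink> Tokenizer<S> {
    fn new(sink: S) -> Self {
        Tokenizer { sink, pending: String::new() }
    }

    /// Pushes another chunk of input and emits every complete token.
    fn feed(&mut self, chunk: &str) {
        self.pending.push_str(chunk);
        loop {
            if self.pending.starts_with('<') {
                // Wait for more input if the tag is not closed yet.
                let Some(end) = self.pending.find('>') else { break };
                let tag = self.pending[1..end].to_string();
                self.pending.drain(..=end);
                let token = if tag.starts_with('/') {
                    Token::EndTag(tag[1..].to_string())
                } else {
                    Token::StartTag(tag)
                };
                self.sink.process_token(token);
            } else if let Some(lt) = self.pending.find('<') {
                let text: String = self.pending.drain(..lt).collect();
                self.sink.process_token(Token::Text(text));
            } else {
                break; // only text (or nothing) so far; keep buffering
            }
        }
    }

    /// Flushes trailing text and returns the sink.
    /// An unterminated tag at the end of input is silently dropped.
    fn end(mut self) -> S {
        if !self.pending.is_empty() && !self.pending.starts_with('<') {
            let text = std::mem::take(&mut self.pending);
            self.sink.process_token(Token::Text(text));
        }
        self.sink
    }
}

#[derive(Debug, PartialEq)]
enum Node {
    Element { name: String, children: Vec<Node> },
    Text(String),
}

/// Toy tree builder: a stack of open elements over a synthetic root.
struct TreeBuilder {
    stack: Vec<(String, Vec<Node>)>,
}

impl TreeBuilder {
    fn new() -> Self {
        TreeBuilder { stack: vec![("#root".to_string(), Vec::new())] }
    }

    /// Closes the innermost open element and attaches it to its parent.
    fn close_top(&mut self) {
        let (name, children) = self.stack.pop().unwrap();
        self.stack.last_mut().unwrap().1.push(Node::Element { name, children });
    }

    /// Closes any elements left open (crude recovery) and returns the tree.
    fn finish(mut self) -> Vec<Node> {
        while self.stack.len() > 1 {
            self.close_top();
        }
        self.stack.pop().unwrap().1
    }
}

impl TokenSink for TreeBuilder {
    fn process_token(&mut self, token: Token) {
        match token {
            Token::StartTag(name) => self.stack.push((name, Vec::new())),
            // The end tag name is not checked; a stray end tag at the root is ignored.
            Token::EndTag(_) => {
                if self.stack.len() > 1 {
                    self.close_top();
                }
            }
            Token::Text(t) => self.stack.last_mut().unwrap().1.push(Node::Text(t)),
        }
    }
}
```

In this sketch a tag split across two calls to feed is handled by buffering, which is the main practical consequence of a push-based design: the caller never has to hold the whole document in memory before parsing starts.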
Macros
Helper to quickly create an expanded name.
Takes a local name as a string and returns its key in the string cache.
Takes a namespace prefix string and returns its key in a string cache.
Takes a namespace URL string and returns its key in a string cache.
Maps the input of namespace_prefix! to the output of namespace_url!.
Create a SmallCharSet, with each space-separated number stored in the set.
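The prefix-to-URL mapping described above can be pictured as a small lookup table. The sketch below uses plain string slices rather than string-cache keys, and the function name namespace_url_for is hypothetical. The set of prefixes is an assumption based on the standard W3C namespace URLs, not a list read from the crate.

```rust
/// Maps a well-known namespace prefix to its namespace URL.
/// Hypothetical stand-in for illustration; the macros listed above
/// work with string-cache keys rather than `&str` values.
fn namespace_url_for(prefix: &str) -> Option<&'static str> {
    match prefix {
        "html" => Some("http://www.w3.org/1999/xhtml"),
        "xml" => Some("http://www.w3.org/XML/1998/namespace"),
        "xmlns" => Some("http://www.w3.org/2000/xmlns/"),
        "xlink" => Some("http://www.w3.org/1999/xlink"),
        "svg" => Some("http://www.w3.org/2000/svg"),
        "mathml" => Some("http://www.w3.org/1998/Math/MathML"),
        // Unknown prefixes have no built-in URL in this sketch.
        _ => None,
    }
}
```

A runtime match like this is only meant to show the relationship between the two name kinds; resolving names through macros lets the same lookup happen without any string comparison at run time.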
Structs
A tag attribute, e.g. class="test" in <div class="test" ...>.
An expanded name, containing the tag and the namespace.
A fully qualified name (with a namespace), used to depict names of tags and attributes.
Represents a set of “small characters”, those with Unicode scalar values less than 64.
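Because every member of such a set has a scalar value below 64, the whole set fits in a single 64-bit integer with one bit per value. The sketch below shows that representation under a different name, SmallSet, to make clear it is an illustration and not the crate's SmallCharSet; the method names are likewise assumptions.

```rust
/// A set of characters whose Unicode scalar values are below 64,
/// stored as one bit per value in a `u64`.
/// Illustrative type; not the crate's `SmallCharSet`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct SmallSet {
    bits: u64,
}

impl SmallSet {
    /// Builds a set from scalar values; each must be less than 64.
    fn from_values(values: &[u8]) -> Self {
        let mut bits = 0u64;
        for &v in values {
            assert!(v < 64, "value out of range for a 64-bit set");
            bits |= 1u64 << v;
        }
        SmallSet { bits }
    }

    /// Returns true if `c` is in the set; values of 64 or more are never members.
    fn contains(&self, c: char) -> bool {
        let v = c as u32;
        v < 64 && (self.bits >> v) & 1 == 1
    }

    /// Length in bytes of the longest prefix of `s` that contains no member.
    fn nonmember_prefix_len(&self, s: &str) -> usize {
        s.char_indices()
            .find(|&(_, c)| self.contains(c))
            .map_or(s.len(), |(i, _)| i)
    }
}
```

A set like this suits a tokenizer's hot loop: with members such as '<' (60), '&' (38), and newline (10), a membership test is one shift and one mask, so long runs of ordinary text can be skipped quickly until a character of interest appears.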